A SPECIAL APPLICATION OF PARTIAL CORRELATION*
By W. L. Crum, Williamstown, Massachusetts



It is well known that if two ordered statistical series — for instance, 
two historical economic series — are subject to rectilinear trends, their 
correlation coefficient is influenced by that fact. Since it is ordinarily 
desired to determine the degree of correlation which is independent of 
the normal trends, it becomes necessary to separate in each series the 
variation due to trend from the miscellaneous variation. A common 
method† of accomplishing this consists in determining the line of trend
and correcting the items of the given series by corresponding amounts. 
In other words, the plan is to study the deviations of the items from the 
line of trend rather than the horizontal. The object of this paper is to 
interpret this process in terms of the partial correlation coefficient and 
to call attention to a corresponding direct method of eliminating the 
secular trend. 

I. STANDARD DEVIATION

Suppose that a single series of $N$ items, $x_i$, is ordered relative to a
particular variable which, although it need not be the time, will be
represented by $t$; and assume that the series has a trend which is
substantially rectilinear. It is sought to find this line of trend and to
determine the standard deviation of the $x_i$ relative to it. From among
the methods of locating the line, we adopt that of least squares.

* Read before the American Mathematical Society on September 9, 1921, under the title "The
significance of the partial correlation coefficient in the comparison of ordered statistical series possessing
rectilinear trends."

† Persons, Warren M., "An Index of General Business Conditions," in particular Part II ("Method
used"), in Review of Economic Statistics, April, 1919.

Let the equation of the line be

$$x = bt + a$$

and let $\xi_i$ be the residual obtained by diminishing $x_i$ by the above value
of the ordinate $x$ for $t$ equal to $t_i$. Then the function which is to be
made minimum is

$$\sum \xi_i^2 = \sum (x_i - b t_i - a)^2 = \sum (x_i - a)^2 - 2b \sum t_i (x_i - a) + b^2 \sum t_i^2$$

and the conditions furnishing a minimum are 

$$a = X - bT, \quad \text{where } X = \frac{1}{N} \sum x_i \text{ and } T = \frac{1}{N} \sum t_i \tag{1}$$

together with:

$$b = \frac{\sum t_i (x_i - a)}{\sum t_i^2} \tag{2}$$

The first of these conditions expresses the fact that the line passes 
through the point having as coordinates the arithmetic means of the 
$t_i$ and $x_i$, and the second gives the slope of the line of best fit.
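
As an illustration of conditions (1) and (2), the following is a minimal Python sketch; it is not part of the original paper. The function name `fit_trend` and the five-item series are hypothetical, chosen only for the example. Since (1) and (2) each involve both a and b, the sketch first eliminates a to obtain the slope, then recovers a from (1), and finally confirms that (2) is satisfied in the form in which it is stated.

```python
def fit_trend(t, x):
    """Least-squares line x = b*t + a, from conditions (1) and (2)."""
    n = len(t)
    T = sum(t) / n                      # arithmetic mean of the t_i
    X = sum(x) / n                      # arithmetic mean of the x_i
    # Substituting a = X - b*T from (1) into (2) and solving for b.
    b = (sum(ti * xi for ti, xi in zip(t, x)) - n * T * X) / (
        sum(ti * ti for ti in t) - n * T * T
    )
    a = X - b * T                       # condition (1)
    return a, b


# Hypothetical series ordered on successive integers.
t = [1, 2, 3, 4, 5]
x = [2.0, 2.9, 4.2, 4.8, 6.1]
a, b = fit_trend(t, x)

# Condition (2) as stated: b = sum t_i (x_i - a) / sum t_i^2.
b_check = sum(ti * (xi - a) for ti, xi in zip(t, x)) / sum(ti * ti for ti in t)
print(a, b, b_check)
```

The two values of the slope agree up to rounding, and the fitted line passes through the point (T, X), as the first condition requires.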
The corresponding minimum value of the function is 

$$\sum \xi_i^2 = \sum (x_i - a)^2 - b^2 \sum t_i^2 = \sum (x_i - X)^2 - b^2 \left( \sum t_i^2 - N T^2 \right)$$

giving

$$\sigma_\xi^2 = \sigma_x^2 - b^2 \sigma_t^2, \tag{3}$$

where $\sigma_x$ is the ordinary standard deviation of the $x_i$, and $\sigma_\xi$ is the
standard deviation of the residuals. In other words, $\sigma_x$ is the standard
deviation of order zero, and $\sigma_\xi$, or $\sigma_{x.t}$, is that of the first order.*

It is evident that $b^2 \sigma_t^2$ can never exceed $\sigma_x^2$, and can equal it only in
case the original distribution is precisely rectilinear; and, moreover,
$b^2 \sigma_t^2$ can never be negative, and is zero only if the trend is horizontal.
Hence, $\sigma_{x.t}$ is zero only if the actual distribution is exactly rectilinear,
and has its maximum value of $\sigma_x$ only in case the trend is horizontal.

It is suggested that for certain purposes the coefficient $\sigma_{x.t}$ is a better
measure of fluctuation than $\sigma_x$. Whereas $\sigma_x$ is ordinarily used to
measure dispersion, $\sigma_{x.t}$ indicates the degree of divergence of the items
from the line of normal trend. The difference between the squares of
the two, $b^2 \sigma_t^2$, is the square of the standard deviation of the ordinates
of the line of trend. In calculating $\sigma_{x.t}$ no correction for trend need
actually be made in the given items: the value is obtained directly by
use of equation (3), and the work is particularly easy if the $t_i$ are
successive integers.
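
The economy claimed for equation (3) may be illustrated by a short Python sketch, offered as an assumption-laden example and not as the author's own computation. The names `residual_sd` and `residual_sd_by_correction` and the twelve-item series are hypothetical. The first function uses only the standard deviations of x and t and the slope b; the second corrects every item for trend and is included solely for comparison. Standard deviations are taken with divisor N, which is the convention that makes (3) hold exactly.

```python
import math


def residual_sd(t, x):
    """Standard deviation about the line of trend, by equation (3).

    No residual is computed: only sigma_x, sigma_t and the slope b are used.
    """
    n = len(t)
    T, X = sum(t) / n, sum(x) / n
    var_t = sum((ti - T) ** 2 for ti in t) / n      # sigma_t squared
    var_x = sum((xi - X) ** 2 for xi in x) / n      # sigma_x squared
    b = sum((ti - T) * (xi - X) for ti, xi in zip(t, x)) / (n * var_t)
    # max() guards against a tiny negative value produced by rounding.
    return math.sqrt(max(var_x - b * b * var_t, 0.0))


def residual_sd_by_correction(t, x):
    """The same quantity found the long way, by correcting each item."""
    n = len(t)
    T, X = sum(t) / n, sum(x) / n
    b = sum((ti - T) * (xi - X) for ti, xi in zip(t, x)) / sum(
        (ti - T) ** 2 for ti in t
    )
    a = X - b * T
    residuals = [xi - b * ti - a for ti, xi in zip(t, x)]
    return math.sqrt(sum(r * r for r in residuals) / n)


# Hypothetical series of N = 12 items with t = 1, 2, ..., N.
N = 12
t = list(range(1, N + 1))
x = [5.2, 6.1, 5.9, 7.4, 7.0, 8.3, 9.1, 8.8, 10.2, 10.9, 11.1, 12.4]

shortcut = residual_sd(t, x)
long_way = residual_sd_by_correction(t, x)

# For successive integers sigma_t squared equals (N^2 - 1) / 12,
# so it need not be obtained by summation at all.
var_t_integers = (N * N - 1) / 12
print(shortcut, long_way, var_t_integers)
```

The two results agree up to rounding. The closed form for the variance of successive integers is a standard identity and is what makes the integer case particularly easy.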

* Yule, G. U., An Introduction to the Theory of Statistics, chap. xii, sec. 6; 5th ed., London: Charles
Griffin & Co., 1919.
† Ibid., chap. ix, sec. 10.

Since the coefficient of correlation† between $x$ and $t$ is defined by

$$r_{xt} = b \frac{\sigma_t}{\sigma_x} \tag{4}$$

equation (3) may be written in the form*

$$\sigma_{x.t}^2 = \sigma_x^2 \left( 1 - r_{xt}^2 \right). \tag{5}$$

II. CORRELATION 

We address ourselves next to the examination of the correlation
between one ordered series, $x_i$, having rectilinear trend defined by the
parameters $a_x$ and $b_x$, and a second, $y_i$, with trend defined by $a_y$ and
$b_y$. Let it be supposed at the start that the correlation is not subject
to any lag.

The equations of the two lines of trend are 
$$x = b_x t + a_x, \qquad y = b_y t + a_y$$

and, if $\xi_i$ and $\eta_i$ are the residuals obtained by diminishing the $i$th items
of the two series by the $i$th ordinates of the respective lines of trend,

$$\xi_i = x_i - b_x t_i - a_x, \qquad \eta_i = y_i - b_y t_i - a_y.$$

It was remarked at the beginning that the correlation sought is that
which exists between these residuals, namely, $r_{\xi\eta}$. Its calculation
involves the product sum $Np$, where

$$\begin{aligned}
Np = \sum \xi_i \eta_i &= \sum (x_i - b_x t_i - a_x)(y_i - b_y t_i - a_y) \\
&= \sum (x_i - a_x)(y_i - a_y) - b_x b_y \sum t_i^2 \\
&= \sum (x_i - X)(y_i - Y) - b_x b_y \left( \sum t_i^2 - N T^2 \right) \\
&= (r_{xy} - r_{xt} r_{yt}) N \sigma_x \sigma_y
\end{aligned}$$

by use of (1), (2), (4). Furthermore, using (4) and (5) with this
result,

$$r_{\xi\eta} = \frac{r_{xy} - r_{xt} r_{yt}}{\sqrt{1 - r_{xt}^2} \, \sqrt{1 - r_{yt}^2}} = r_{xy.t}. \tag{6}$$

We find, therefore, as might have been foreseen, that the correlation
between $\xi_i$ and $\eta_i$, as ordinarily calculated by "correcting" the
variates $x_i$ and $y_i$, is merely the partial correlation† of $x_i$ and $y_i$, independent
of $t$. Thus, in calculating the correlation between $\xi_i$ and $\eta_i$, we are
relieved of the necessity of correcting for trend: the arithmetic involved
in computing the residuals $\xi_i$ and $\eta_i$ is eliminated entirely. The
simpler process is to use formula (6) directly: the coefficients $r_{xt}$ and $r_{yt}$
must be evaluated anyway in order to locate the lines of trend, and the
labor of determining $r_{xy}$ is no greater than that of finding $r_{\xi\eta}$.
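
The identity of the two procedures can be checked numerically. The Python sketch below is an illustration under stated assumptions and not the author's own computation: the function names `corr`, `partial_corr` and `detrend` and the two ten-item series are hypothetical. It evaluates formula (6) from the three simple coefficients, and separately correlates the residuals left after each series has been corrected for its least-squares trend.

```python
import math


def corr(u, v):
    """Ordinary (product-moment) coefficient of correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(x, y, t):
    """Formula (6): correlation of x and y, independent of t."""
    r_xy, r_xt, r_yt = corr(x, y), corr(x, t), corr(y, t)
    return (r_xy - r_xt * r_yt) / (
        math.sqrt(1 - r_xt ** 2) * math.sqrt(1 - r_yt ** 2)
    )


def detrend(v, t):
    """Residuals of v about its least-squares line on t (the long way)."""
    n = len(t)
    T, V = sum(t) / n, sum(v) / n
    b = sum((ti - T) * (vi - V) for ti, vi in zip(t, v)) / sum(
        (ti - T) ** 2 for ti in t
    )
    a = V - b * T
    return [vi - b * ti - a for vi, ti in zip(v, t)]


# Hypothetical series: x rises and y falls along t.
t = list(range(1, 11))
x = [3.1, 3.9, 5.4, 5.8, 7.7, 8.1, 9.6, 10.2, 11.9, 12.3]
y = [10.2, 9.1, 9.5, 7.8, 7.9, 6.1, 6.4, 4.9, 4.6, 3.2]

r_direct = partial_corr(x, y, t)                      # no residuals formed
r_residuals = corr(detrend(x, t), detrend(y, t))      # items corrected first
print(r_direct, r_residuals)
```

The two coefficients agree up to rounding, although the first requires no correction of the items.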

* Ibid., chap. ix, sec. 14; and chap. xii, sec. 3.
† Ibid., chap. xii, sec. 13.
It is to be noted that this method applies only for the elimination of
rectilinear trends. If it is desired to correct also for so-called seasonal 
variation before examining correlation, the computation of the partial 
correlation coefficient will not automatically effect that correction; 
but even in this case it will eliminate the rectilinear factors. 

III. EFFECT OF LAG 

It remains to inquire into the influence of lag upon the conclusion of
Section II. Suppose that $y$ lags behind $x$ by an amount $L$: that $t'_i$, the
value of $t$ associated with $y_i$, is $L$ less than $t_i$, the value of $t$ associated
with $x_i$. The residuals are, on this assumption,

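$$\xi_i = x_i - b_x t_i - a_x, \qquad \eta_i = y_i - b_y t'_i - a_y$$

and the product sum takes the form:

$$\begin{aligned}
Np = \sum \xi_i \eta_i &= \sum (x_i - b_x t_i - a_x)(y_i - b_y t'_i - a_y) \\
&= \sum (x_i - a_x)(y_i - a_y) - b_x b_y \sum t_i t'_i \\
&= \sum (x_i - X)(y_i - Y) - b_x b_y \left( \sum t_i t'_i - N T T' \right) \\
&= \sum (x_i - X)(y_i - Y) - b_x b_y \sum (t_i - T)(t'_i - T') \\
&= (r_{xy} - r_{xt} r_{yt'} r_{tt'}) N \sigma_x \sigma_y
\end{aligned}$$

giving

$$r_{\xi\eta} = \frac{r_{xy} - r_{xt} r_{yt'} r_{tt'}}{\sqrt{1 - r_{xt}^2} \, \sqrt{1 - r_{yt'}^2}}. \tag{7}$$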

It is evident that the value given by (7) is not exactly the partial
correlation coefficient, but differs from it by the multiplier $r_{tt'}$ in the
second term of the numerator. Nevertheless, the calculation of the
lagging correlation between $\xi_i$ and $\eta_i$ is facilitated by the use of (7)
rather than the laborious method of correcting the original items. As
the $t_i$ and $t'_i$ differ ordinarily by a constant lag, $r_{tt'}$ will generally be
unity in the cases arising in practice. In fact, we have

$$t'_i = t_i - L$$

and, if the lag is constant, the arithmetic means $T$, $T'$ will be related by

$$T' = T - L.$$

Hence quantities of the type $t'_i - T'$ reduce to $t_i - T$, and the numerator
and denominator of $r_{tt'}$ become identical: the coefficient of correlation
between $t$ and $t'$ reduces to unity, as might have been foreseen, since the
relation between the two $t$ series is perfect when $L$ is constant.

Therefore formula (6) and the conclusion of Section II still hold 
when there is lag, provided the lag is constant. This will cover all the 
simpler cases arising in practice. Should it be desirable, however, to 
compare two series with variable lag, formula (7) can be used; and the 
calculation of the coefficient $r_{tt'}$ will not greatly complicate the work.
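
The constant-lag case may be illustrated by the following Python sketch, which is an example under stated assumptions and not the author's own computation. The names `corr` and `lagged_residual_corr`, the lag of two units, and the two ten-item series are hypothetical. The sketch evaluates formula (7) as stated in Section III, with the trend of y referred to the lagged variable t', and compares it with formula (6); the comparison is confined to a constant lag, for which the coefficient between t and t' is unity.

```python
import math


def corr(u, v):
    """Ordinary (product-moment) coefficient of correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def lagged_residual_corr(x, t, y, t_lag):
    """Formula (7): x is ordered on t, y on the lagged variable t_lag."""
    r_xy = corr(x, y)
    r_xt = corr(x, t)
    r_yt = corr(y, t_lag)        # r_{yt'}
    r_tt = corr(t, t_lag)        # r_{tt'}, the multiplier peculiar to (7)
    return (r_xy - r_xt * r_yt * r_tt) / (
        math.sqrt(1 - r_xt ** 2) * math.sqrt(1 - r_yt ** 2)
    )


# Hypothetical data: each y_i belongs to a value of t that is L less than t_i.
L = 2
t = list(range(3, 13))
t_lag = [ti - L for ti in t]
x = [4.0, 5.2, 5.1, 6.8, 7.3, 7.9, 9.4, 9.0, 10.6, 11.5]
y = [2.1, 2.0, 3.4, 3.1, 4.6, 4.4, 5.9, 6.3, 6.2, 7.8]

r_tt = corr(t, t_lag)                                  # unity for constant lag
r_formula_7 = lagged_residual_corr(x, t, y, t_lag)

# Formula (6), evaluated with the unlagged t for both series.
r_xy, r_xt, r_yt = corr(x, y), corr(x, t), corr(y, t)
r_formula_6 = (r_xy - r_xt * r_yt) / (
    math.sqrt(1 - r_xt ** 2) * math.sqrt(1 - r_yt ** 2)
)
print(r_tt, r_formula_7, r_formula_6)
```

With a constant lag the multiplier is unity and the two formulas give the same value, in accordance with the conclusion drawn above.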



